Enrico Zini: Pipelining
This is part of a series of posts on ideas for an ansible-like provisioning
system, implemented in Transilience.
Running actions on a server is nice, but a network round trip for each action
is not very efficient. If I need to run a linear sequence of actions, I can
stream them all to the server, and then read replies streamed from the server
as they get executed.
This technique is called pipelining, and one can see it used, for example, in
Redis or Mitogen.
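To make the saving concrete, here is a minimal sketch that counts round trips for the two approaches. `FakeServer` and the action strings are made up for illustration and are not part of Transilience; one call to `request()` stands in for one network round trip.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical in-memory stand-in for a remote host; counts round trips."""
    round_trips: int = 0
    log: list = field(default_factory=list)

    def execute(self, action):
        self.log.append(action)
        return f"done: {action}"

    def request(self, actions):
        # One call models one network round trip, however many actions it carries
        self.round_trips += 1
        return [self.execute(a) for a in actions]


def run_one_by_one(server, actions):
    # Send an action, wait for its reply, then send the next one
    return [server.request([a])[0] for a in actions]


def run_pipelined(server, actions):
    # Stream the whole linear sequence, then read all the replies
    return server.request(list(actions))


actions = ["apt install fail2ban", "copy jail.local", "restart fail2ban"]
slow, fast = FakeServer(), FakeServer()

# Both strategies produce the same replies, in the same order
assert run_one_by_one(slow, actions) == run_pipelined(fast, actions)
print(slow.round_trips, fast.round_trips)  # prints: 3 1
```

The replies are identical either way; only the number of times we wait on the network changes.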
Roles
Ansible has the concept of "Roles" as a series of related tasks: I'll play with
that. Here's an example role to install and set up fail2ban:
I prototyped roles as classes, with methods that push actions down the
pipeline. If an action fails, all further actions for the same role won't be
executed, and will be marked as skipped.
Since skipping is applied per-role, it means that I can blissfully stream
actions for multiple roles to the server down the same pipe, and errors in one
role will stop executing that role and not others. Potentially I can get
multiple roles going with a single network round-trip:
That looks like a playbook, using Python as glue rather than YAML.
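The per-role skipping described above can be sketched on the receiving side roughly like this. It is only an illustration under my own assumptions: the `Action` tuple and `execute_stream()` are hypothetical names, not Transilience's actual wire format or API.

```python
from collections import namedtuple

# Hypothetical wire format: each streamed action is tagged with its role name
Action = namedtuple("Action", ["role", "name", "run"])


def execute_stream(stream):
    """Run actions in order; after a failure, skip the rest of that role only."""
    failed_roles = set()
    results = []
    for action in stream:
        if action.role in failed_roles:
            results.append((action.role, action.name, "skipped"))
            continue
        try:
            action.run()
            results.append((action.role, action.name, "ok"))
        except Exception:
            failed_roles.add(action.role)
            results.append((action.role, action.name, "failed"))
    return results


def ok():
    pass


def boom():
    raise RuntimeError("simulated failure")


# Actions of two roles interleaved in the same pipe
stream = [
    Action("fail2ban", "install", boom),
    Action("prosody", "install", ok),
    Action("fail2ban", "configure", ok),
    Action("prosody", "configure", ok),
]
results = execute_stream(stream)
for line in results:
    print(line)
```

The failed fail2ban install causes only the later fail2ban action to be skipped, while the prosody actions sharing the pipe still run.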
Decision making in roles
Besides filing a series of actions, a role may need to take decisions based on
the results of previous actions, or on facts discovered from the server. In
that case, we need to wait until the results we need come back from the server,
and then decide if we're done or if we want to send more actions down the pipe.
Here's an example role that installs and configures Prosody:
This files some general actions down the pipe, with a hook that says: when the
results of this action come back, run self.have_facts().
Here we want to update Apt's cache, which is a slow operation, only after we
actually write /etc/apt/sources.list.d/debian-buster-backports.list.
This is the same playbook run with Ansible sped up via the Mitogen backend,
which makes Ansible more bearable:
This is the same playbook ported to Transilience:
Doing nothing went from 2 minutes down to 3 seconds!
That's the kind of running time that finally makes me comfortable with
maintaining my VPS by editing the playbook only, and never logging in to mess
with the system configuration by hand!
Next steps
I'm quite happy with what I have: I can now maintain my VPS with a simple
script with quick iterative cycles.
I might use it to develop new playbooks, and port them to Ansible only when
they're tested and need to be shared with infrastructure that needs to rely on
something more solid and battle-tested than a prototype provisioning system.
I might also keep working on it as I have more interesting ideas that I'd like
to try. I feel like Ansible reached some architectural limits that are hard to
overcome without a major redesign, and are in many ways hardcoded in its
playbook configuration. It's nice to be able to try out new designs without
that baggage.
I'd love it if even just the library of Transilience actions could grow and
gain widespread use. Ansible modules standardized a set of management
operations that I think became the way people think about system management,
and that should really be broadly available outside of Ansible.
If you are interested in playing with Transilience, such as:

    class Role(role.Role):
        def main(self):
            self.add(builtin.apt(
                name=["fail2ban"],
                state="present",
            ))
            self.add(builtin.copy(
                content=inline("""
                    [postfix]
                    enabled = true
                    [dovecot]
                    enabled = true
                """),
                dest="/etc/fail2ban/jail.local",
                owner="root",
                group="root",
                mode=0o644,
            ), name="configure fail2ban")


    #!/usr/bin/python3

    import sys
    from transilience.system import Mitogen
    from transilience.runner import Runner


    @Runner.cli
    def main():
        system = Mitogen("my server", "ssh", hostname="server.example.org", username="root")

        runner = Runner(system)

        # Send roles to the server
        runner.add_role("general")
        runner.add_role("fail2ban")
        runner.add_role("prosody")

        # Run until all roles are done
        runner.main()


    if __name__ == "__main__":
        sys.exit(main())


    from transilience import actions, role
    from transilience.actions import builtin
    from .handlers import RestartProsody


    class Role(role.Role):
        """
        Set up prosody XMPP server
        """
        def main(self):
            self.add(actions.facts.Platform(), then=self.have_facts)

            self.add(builtin.apt(
                name=["certbot", "python-certbot-apache"],
                state="present",
            ), name="install support packages")

            self.add(builtin.apt(
                name=["prosody", "prosody-modules", "lua-sec", "lua-event", "lua-dbi-sqlite3"],
                state="present",
            ), name="install prosody packages")

        def have_facts(self, facts):
            facts = facts.facts  # Malkovich Malkovich Malkovich!

            domain = facts["domain"]
            ctx = {
                "ansible_domain": domain
            }

            self.add(builtin.command(
                argv=["certbot", "certonly", "-d", f"chat.{domain}", "-n", "--apache"],
                creates=f"/etc/letsencrypt/live/chat.{domain}/fullchain.pem"
            ), name="obtain chat certificate")

            with self.notify(RestartProsody):
                self.add(builtin.copy(
                    content=self.template_engine.render_file("roles/prosody/templates/prosody.cfg.lua", ctx),
                    dest="/etc/prosody/prosody.cfg.lua",
                ), name="write prosody configuration")

                self.add(builtin.copy(
                    src="roles/prosody/templates/firewall-ruleset.pfw",
                    dest="/etc/prosody/firewall-ruleset.pfw",
                ), name="write prosody firewall")

            # ...

When self.have_facts() runs, the role can use the results to build certbot
command lines, render prosody's configuration from Jinja2 templates, and
file further actions down the pipe.
Note that this way, while the server is potentially still busy installing
prosody, we're already streaming prosody's configuration to it.
If anything goes wrong with the installation of prosody's package, the role
will be marked as failed and all further actions of the same role, even those
filed by have_facts(), will be skipped.
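The hook mechanism can be pictured with a toy model like the one below. It is a deliberately simplified, synchronous sketch: `SketchRole`, `fake_remote()` and `run()` are names I made up, and real results would arrive asynchronously while other actions are still in flight.

```python
from collections import deque


class SketchRole:
    """Toy role: files actions into a queue, optionally with a `then` callback."""

    def __init__(self):
        self.pending = deque()
        self.sent = []

    def add(self, action, then=None):
        self.pending.append((action, then))
        self.sent.append(action)

    def main(self):
        self.add("gather facts", then=self.have_facts)
        self.add("install prosody")

    def have_facts(self, result):
        # Runs only once the result of "gather facts" is back
        self.add(f"certbot -d chat.{result['domain']}")


def fake_remote(action):
    # Hypothetical remote end: only the facts action returns useful data
    return {"domain": "example.org"} if action == "gather facts" else {}


def run(role):
    role.main()
    while role.pending:
        action, then = role.pending.popleft()
        result = fake_remote(action)
        if then is not None:
            then(result)  # the hook may file more actions down the pipe


role = SketchRole()
run(role)
print(role.sent)
```

The certbot action is filed only after the facts are known, yet it lands in the same queue as everything else, so it is subject to the same per-role ordering and skipping.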
Notify and handlers
In the previous example self.notify() also appears: that's my attempt to
model the equivalent of Ansible's handlers. If any of the actions inside the
with block produce changes, then the RestartProsody role will be executed,
potentially filing more actions at the end of the playbook.
The runner will take care of collecting all the triggered role classes in a
set, which discards duplicates, and then running the main() method of all
resulting roles, which will cause more actions to be filed down the pipe.
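A rough sketch of how such a notify block and the deduplicating set could fit together is below. `SketchRunner` is hypothetical, and for brevity each action is told directly whether it changed anything, whereas in practice that information comes back from the server.

```python
from contextlib import contextmanager


class RestartProsody:
    """Hypothetical handler role."""

    def main(self, log):
        log.append("restart prosody")


class SketchRunner:
    def __init__(self):
        self.active = []        # stack of handler tuples for nested notify blocks
        self.triggered = set()  # a set discards duplicate handler classes
        self.log = []

    @contextmanager
    def notify(self, *handlers):
        self.active.append(handlers)
        try:
            yield
        finally:
            self.active.pop()

    def add(self, name, changed):
        # `changed` is passed in directly here: an assumption made for brevity
        self.log.append(name)
        if changed:
            for handlers in self.active:
                self.triggered.update(handlers)

    def run_handlers(self):
        # Run main() once for each distinct triggered handler role
        for handler in self.triggered:
            handler().main(self.log)
        self.triggered.clear()


runner = SketchRunner()
with runner.notify(RestartProsody):
    runner.add("write prosody configuration", changed=True)
    runner.add("write prosody firewall", changed=True)
runner.add("unrelated action", changed=True)
runner.run_handlers()
print(runner.log)
```

Two changed actions inside the block still result in a single restart, and the action outside the block triggers nothing.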
Action conditions
Sometimes some actions are only meaningful as consequences of other actions.
Let's take, for example, enabling buster-backports as an extra apt source:

    a = self.add(builtin.copy(
        owner="root",
        group="root",
        mode=0o644,
        dest="/etc/apt/sources.list.d/debian-buster-backports.list",
        content="deb [arch=amd64] https://mirrors.gandi.net/debian/ buster-backports main contrib",
    ), name="enable backports")

    self.add(builtin.apt(
        update_cache=True
    ), name="update after enabling backports",
        # Run only if the previous copy changed anything
        when={a: ResultState.CHANGED},
    )

If /etc/apt/sources.list.d/debian-buster-backports.list was already there
from a previous run, we can skip downloading the new package lists.
The when= attribute adds an annotation to the action that is sent down the
pipeline, that says that it should only be run if the state of a previous
action matches the given one.
In this case, when it's the turn of "update after enabling backports" on the
remote, it gets skipped unless the state of the previous "enable backports"
action is CHANGED.
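On the remote side, evaluating such an annotation could look roughly like the sketch below. The `State` enum is a hypothetical stand-in for Transilience's ResultState (only CHANGED appears in the example above; NOOP and SKIPPED are my assumptions), and `run_remote()` is an invented name.

```python
from enum import Enum


class State(Enum):
    # Hypothetical stand-in for ResultState; NOOP and SKIPPED are assumed names
    NOOP = "noop"
    CHANGED = "changed"
    SKIPPED = "skipped"


def run_remote(actions):
    """actions: list of (action_id, callable returning a State, when dict)."""
    states = {}
    for action_id, run, when in actions:
        # Run only if every referenced earlier action ended in the wanted state
        if all(states.get(dep) == wanted for dep, wanted in when.items()):
            states[action_id] = run()
        else:
            states[action_id] = State.SKIPPED
    return states


def pipeline(copy_result):
    return [
        ("enable backports", lambda: copy_result, {}),
        ("apt update", lambda: State.CHANGED, {"enable backports": State.CHANGED}),
    ]


first_run = run_remote(pipeline(State.CHANGED))  # sources file newly written
second_run = run_remote(pipeline(State.NOOP))    # sources file already in place
print(first_run["apt update"], second_run["apt update"])
```

Because the condition travels with the action, the decision is taken on the server without a round trip back to the controller.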
Effects of pipelining
I ported enough of Ansible's modules to be able to run the provisioning scripts
of my VPS entirely via Transilience.
This is the playbook run as plain Ansible:

    $ time ansible-playbook vps.yaml
    [...]
    servername : ok=55 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

    real    2m10.072s
    user    0m33.149s
    sys     0m10.379s


    $ export ANSIBLE_STRATEGY=mitogen_linear
    $ time ansible-playbook vps.yaml
    [...]
    servername : ok=55 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

    real    0m24.428s
    user    0m8.479s
    sys     0m1.894s


    $ time ./provision
    [...]

    real    0m2.585s
    user    0m0.659s
    sys     0m0.034s

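For a quick sense of scale, the wall-clock ("real") times reported above can be turned into ratios. This is plain arithmetic on the three measurements from single runs, not a benchmark; the helper name `seconds()` is mine.

```python
def seconds(minutes, secs):
    """Convert a `time`-style XmY.YYYs reading into seconds."""
    return minutes * 60 + secs


# "real" times taken from the three runs above
plain = seconds(2, 10.072)        # plain Ansible
mitogen = seconds(0, 24.428)      # Ansible with the Mitogen strategy
transilience = seconds(0, 2.585)  # Transilience

print(f"plain Ansible / Transilience:     {plain / transilience:.1f}x")
print(f"Ansible + Mitogen / Transilience: {mitogen / transilience:.1f}x")
```

That works out to roughly a 50-fold reduction against plain Ansible and a bit over 9-fold against Ansible with Mitogen, for a run where almost nothing needed changing.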
- polishing the packaging, adding a setup.py, publishing to PyPI, packaging in Debian
- adding example playbooks
- porting more Ansible modules to Transilience actions
- improving the command line interface
- test other ways to feed actions to pipelines
- test other pipeline primitives
- add backends besides Local and Mitogen
- prototype a parser to turn a subset of YAML playbook syntax into Transilience actions
- adopt it into your multinational organization infrastructure to speed up provisioning times by orders of magnitude at the cost of the development time that it takes to turn this prototype into something solid and road tested
- create a startup and get millions in venture capital to disrupt the provisioning ecosystem